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Abstract. Periodicity analysis of unevenly collected data 
is a relevant issue in several scientific fields. In astro- 
physics, for example, we have to find the fundamental pe- 
riod of light or radial velocity curves which are unevenly 
sampled observations of stars. Classical spectral analysis 
methods are unsatisfactory to solve the problem. In this 
paper we present a neural network based estimator system 
which performs well the frequency extraction in unevenly 
sampled signals. It uses an unsupervised Hebbian non- 
linear neural algorithm to extract, from the interpolated 
signal, the principal conrponents which, in turn, are used 
by the MUSIC frequency estimator algorithm to extract 
the frequencies. The neural network is tolerant to noise 
and works well also with few points in the sequence. We 
benchmark the system on synthetic and real signals with 
the Periodogram and with the Cramer-Rao lower bound. 
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1. Introduction 

The search for periodicities in time or spatial dependent 
signals is a topic of the uttermost relevance in many fields 
of research, from geology (stratigraphy, seismology, etc.; 
(Brescia et al. 1996) ) to astronomy (Barone et al. 1994) 
where it finds wide application in the study of light curves 
of variable stars, AGN's, etc. 

The more sensitive instrumentation and observational 
techniques become, the more frequently we find variable 
signals in time domain that previously were believed to 
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be constant. Research for possible periodicities in the sig- 
nal curves is then a natural consequence, when not an 
important issue. One of the most relevant problems con- 
cerning the techniques of periodic signal analysis is the 
way in which data are collected: many time series are col- 
lected at unevenly sampling rate. We have two types of 
problems related either to unknown fundamental period 
of the data, or their unknown multiple periodicities. Typ- 
ical cases, for instance in Astronomy, are the determina- 
tion of the fundamental period of eclipsing binaries both 
of light and radial velocity curves, or the multiple peri- 
odicities determination of ligth curves of pulsating stars. 
The difficulty arising from unevenly spaced data is rather 
obvious and many attempts have been made to solve the 
problem in a more or less satisfactory way. In this paper 
we will propose a new way to approach the problem us- 
ing neural networks, that seems to work satisfactory well 
and seems to overcome most of the problems encountered 
when dealing with unevenly sampled data. 

2. Spectral analysis and unevenly spaced data 

2.1. Introduction 

In what follows, we assume a; to be a physical variable 
measured at discrete times U. x{ti) can be written as the 
sum of the signal Xs and random errors R: Xi = x(ti) = 
Xs{ti) + R{ti). The problem we are dealing with is how to 
estimate fundamental frequencies which may be present in 
the signal x^iU) (Deeming 1975; Kay 1988; Marple 1987). 
If X is measured at uniform time steps (even sampling) 
(Home & Baliunas 1986; Scargle 1982) there are a lot of 
tools to effectively solve the problem which are based on 
Fourier analysis (Kay 1988; Marple 1987; Oppennheim & 
Schafer 1965). These methods, however, are usually unre- 
liable for unevenly sampled data. For instance, the typical 
approach of resampling the data into an evenly sampled 
sequence, through interpolation, introduces a strong am- 
plification of the noise which affects the effectiveness of all 
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Fourier based techniques which are strongly dependent on 
the noise level (Horowitz 1974). 

There are other techniques used in specific areas 
(Ferraz-Mello 1981; Lomb 1976): however, none of them 
faces directly the problem, so that they are not truly reli- 
able. The most used tool for periodicity analysis of evenly 
or unevenly sampled signals is the Periodogram (Lomb 
1976; Scargle 1982); therefore we will refer to it to evalu- 
ate our system. 



Let us suppose to have N points x{n) and to compute 
mean and variance: 



1 ^ 



N 



1 ^ 

— Y^ {x{n) 



x)^(2) 



Therefore, the normalised Lomb's P (power spectra as 
function of an angular frequency to = 27r/ > 0) is defined 
as follows 



2.2. Periodogram and its variations 

The Periodogram (P), is an estimator of the signal en- 
ergy in the frequency domain (Deeming 1975; Kay 1988; 
Marple 1987; Oppcnnhcim & Schafer 1965). It has been 
extensively applied to pulsating star light curves, unevenly 
spaced, but there are difficulties in its use, specially con- 
cerning with aliasing effects. 



2.2.1. Scargle's Periodogram 

This tool is a variation of the classical P. It was introduced 
by J.D. Scargle (Scargle 1982) for these reasons: 1) data 
from instrumental sampling are often not equally spaced; 
2) due to P inconsistency (Kay 1988; Marple 1987; Op- 
pennheim & Schafer 1965), we must introduce a selection 
criterion for signal components. 

In fact, in the case of even sampling, the classical P 
has a simple statistic distribution: it is exponentially dis- 
tributed for Gaussian noise. In the uneven sampling case 
the distribution becomes very complex. However, Scar- 
gle's P has the same distribution of the even case (Scargle 
1982). Its definition is: 



^N-l 



PM) = 



2 El"o'cos2 2^/(t„-r) 

El-o^x(n)sin2^/(t„-T)]^ 

El"o'sin2 27r/(i„-r) 



(1) 



where 



^ 1 E„^o^sin47r/^n 
47r/^;^j^icos47r/f„ 

and T is a shift variable on the time axis, / is the frequency, 
{x (n) , i„} is the observation series. 
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E„=o (2^(") - ^) sinw(t„ - T)f 
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„=o sm uj(tn - r) 



(3) 



where r is defined by the equation 
tan (2ciJt) 






Y.v-'^ si'^ 2wt„ 



En=o cos2wi„ 

and r is an offset, lo is the frequency, {x (n) , t„} is the ob- 
servation series. The horizontal lines in the figures 19, 22, 
25, 27, 32 and 34 correspond to the practical significance 
levels, as indicated in (Numerical Recipes in C 1988-1992). 

2.3. Modern spectral analysis 

Frequency estimation of narrow band signals in Gaus- 
sian noise is a problem related to many fields (Kay 1988; 
Marple 1987). Since the classical methods of Fourier anal- 
ysis suffer from statistic and resolution problems, then 
newer techniques based on the analysis of the signal au- 
tocorrelation matrix eigenvectors were introduced (Kay 
1988; Marple 1987). 

2.3.1. Spectral analysis with eigenvectors 

Let us assume to have a signal with p sinusoidal com- 
ponents (narrow band). The p sinusoids are modelled as 
a stationary ergodic signal, and this is possible only if 
the phases are assumed to be indipendent random vari- 
ables uniformly distributed in [0, 27r) (Kay 1988; Marple 
1987). To estimate the frequencies we exploit the proper- 
ties of the signal autocorrelation matrix (a.m.) (Kay 1988; 
Marple 1987). The a.m. is the sum of the signal and the 
noise matrices; the p principal eigenvectors of the signal 
matrix allow the estimate of frequencies; the p principal 
eigenvectors of the signal matrix are the same of the total 
matrix. 



2.2.2. Lomb's Periodogram 

This tool is another variation of the classical P and is sim- 
ilar to the Scargle's P. It was introduced by Lomb (Lomb 
1976) and we used the Numerical Recipes in C release 
(Numerical Recipes in C 1988-1992). 



3. PCA Neural Nets 

3.1. Introduction 

Principal Component analysis (PCA) is a widely used 
technique in data analysis. Mathematically, it is defined 
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as follows: let C = E{xx'^) be the covariance matrix of 
L-dimensional zero mean input data vectors x. The z-th 
principal component of x is defined as x-^c(j), where c{i) 
is the normalized eigenvector of C corresponding to the 
j-th largest eigenvalue X{i). The subspace spanned by the 
principal eigenvectors c(l), . . . ,c(M), (A/ < L)) is called 
the PCA subspace (of dimensionality M) (Oja et al. 1991; 
Oja et al. 1996). PCA's can be neurally realized in vari- 
ous ways (Baldi & Hornik 1989; Jutten & Herault 1991; 
Oja 1982; Oja et al. 1991; Plumbley 1993; Sanger 1989). 
The PCA neural network used by us is a one layer feedfor- 
ward neural network which is able to extract the principal 
components of the stream of input vectors. Typically, Heb- 
bian type learning rules are used, based on the one unit 
learning algorithm originally proposed by Oja (Oja 1982). 
Many different versions and extensions of this basic al- 
gorithm have been proposed during the recent years; see 
(Karhunen & Joutsensalo 1994; Karhunen & Joutsensalo 
1995; Oja et al. 1996; Sanger 1989). 

3.2. Linear, robust, nonlinear PCA Neural Nets 

The structure of the PCA neural network can be 
summarised as follows (Karhunen & Joutsensalo 1994; 
Karhunen & Joutsensalo 1995; Oja et al. 1996; Sanger 
1989): there is one input layer, and one forward layer 
of neurons totally connected to the inputs; during the 
learning phase there are feedback links among neurons, 
that classify the network structure as either hierarchical 
or symmetric. After the learning phase the network be- 
comes purely feedforward. The hierarchical case leads to 
the well known GHA algorithm (Karhunen & Joutsensalo 
1995; Sanger 1989); in the symmetric case we have the 
Oja's subspace network (Oja 1982). 

PCA neural algorithms can be derived from optimi- 
sation problems, such as variance maximization and rep- 
resentation error minimisation (Karhunen & Joutsensalo 
1994; Karhunen & Joutsensalo 1995) so obtaining non- 
linear algorithms (and relative neural networks). These 
neural networks have the same architecture of the lin- 
ear ones: either hierarchical or symmetric. These learn- 
ing algorithms can be further classified in: robust PCA 
algorithms and nonlinear PCA algorithms. We define ro- 
bust a PCA algorithm When the objective function grows 
less than quadratically (Karhunen & Joutsensalo 1994; 
Karhunen & Joutsensalo 1995). The nonlinear learning 
function appears at selected places only. In nonlinear PCA 
algorithms all the outputs of the neurons are nonlinear 
function of the responses. 



algorithm: 

Wfc+i(i) = Wk{i) + fJ.kg{yk{i))ek(i), 
ek{i) = Xfe -^2/fe(j)wfe(j) 



(4) 



In the hierarchical case we have I{i) = z. In the sym- 
metric case J(z) = M, the error vector ek(i) becomes the 
same e^ for all the neurons, and equation(y) can be com- 
pactly written as: 



Wfc+i = Wfe + Mefe.g(yD 



(5) 



where y = W^x is the instantaneous vector of neuron re- 
sponses. The learning function g, derivative of/, is applied 
separately to each component of the argument vector. 

The robust generalisation of the representation er- 
ror problem (Karhunen & Joutsensalo 1994; Karhunen & 
Joutsensalo 1995), with f{t) < t'^, leads to the stochastic 
gradient algorithm : 



Wfc+i(i) == Wfe(i) + ^(wfe(i)'^.g(efe(i))xfe 
+ XfeWfc(i)g(efe(i))) 



(6) 



This algorithm can be again considered in both hierarchi- 
cal and symmetric cases. In the symmetric case I(i) = M, 
the error vector is the same (e^) for all the weights Wfe. In 
the hierarchical case I{i) = i, equation(|6|) gives the robust 
counterparts of principal eigenvectors c(i). 



3.2.2. Approximated Algorithms 

The first update term ■Wk{i)'^ g{ek{i))xk in eq.(g) is pro- 
portional to the same vector x^ for all weights Wfe(i). Fur- 
thermore, we can assume that the error vector e^ should 
be relatively small after the initial convergence. Hence, we 
can neglet the first term in equation(@) and this leads to: 



Wfc+l 



(i) = Wfe(i) + nxlyk{i)g{ek{i)) 



(7) 



3.2.3. Nonlinear PCA Algorithms 



Let us consider now the nonlinear extensions of PCA al- 
gorithms. We can obtain them in a heuristic way by re- 
quiring all neuron outputs to be always nonlinear in the 
equation(H) (Karhunen & Joutsensalo 1994; Karhunen & 
Joutsensalo 1995). This leads to: 



3.2.1. Robust PCA algorithms 

In the robust generalization of variance maximisation, the 
objective function f{t) is assumed to be a valid cost func- 
tion (Karhunen & Joutsensalo 1994; Karhunen & Jout- 
sensalo 1995), such as Incos(t) e |i|. This leads to the 



Wk+i{i) = Wk{i) + ng{yk{i))iik{i), 

Hi) 

bfc(i) = Xfc - ^5(2/fc(j))wfe(j) yi = l,...,p 



(8) 
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4. Independent Component Analysis 

Independent Component Analysis (ICA) is a useful exten- 
sion of PCA that was developed in context with source or 
signal separation applications (Oja et al. 1996): instead of 
requiring that the coefficients of a linear expansion of data 
vectors are uncorrelated, in ICA they must be mutually in- 
dependent or as independent as possible. This implies that 
second order moments are not sufficient, but higher order 
statistics are needed in determining ICA. This provides a 
more meaningful representation of data than PCA. In cur- 
rent ICA methods based on PCA neural networks, the fol- 
lowing data model is usually assumed. The L-dimensional 
fc-th data vector x^ is of the form (Oja et al. 1996): 



Xfc = Asfc 



nfc 



M 

E 



Sfc(i)a(i) 



nfc 



(9) 



where in the M-vector s^ — [sfc(l), . . . , Sfc(Af)]^, Sk{i) 
denotes the i-th independent component (source signal) 
at time fc, A = [a(l), . . . , a(M)] is a L x M mixing matrix 
whose columns a(i) are the basis vectors of ICA, and n^ 
denotes noise. 

The source separation problem is now to find an M x L 
separating matrix B so that the Af -vector y^ — Bx^ is an 
estimate yk = s^ of the original independent source signal 
(Oja et al. 1996). 

4-.1. Whitening 

Whitening is a linear transformation A such that, given 
a matrix C, we have AC A = D where D is a diagonal 
matrix with positive elements (Kay 1988; Marple 1987). 

Several separation algorithms utilise the fact that if 
the data vectors x^ are first pre-processed by whitening 
them (i.e. E(xkxJ.) = I with E{.) denoting the expecta- 
tion), then the separating matrix B becomes orthogonal 
(BB^ = I see (Oja et al. 1996)). 

Approximating contrast functions which are max- 
imised for a separating matrix have been introduced be- 
cause the involved probability densities are unknown (Oja 
et al. 1996). 

It can be shown that, for prewhitened input vectors, 
the simpler contrast function given by the sum of kurtoses 
is maximised by a separating matrix B (Oja et al. 1996). 

However, we found that in our experiments the whiten- 
ing was not as good as we expected, because the esti- 
mated frequencies calculated for prewhitened signals with 
the neural estimator (n.e.) were not too much accurate. 

In fact we can pre-elaborate the signal, whitening it, 
and then we can apply the n.e. . Otherwise we can ap- 
ply the whitening and separate the signal in independent 
components with the nonlinear neural algorithm of equa- 
tion(^) and then apply the n.e. to each of these compo- 
nents and estimate the single frequencies separately. 

The first method gives comparable or worse results 
than n.e. without whitening. The second one gives worse 



results and is very expensive. When we used the whitening 
in our n.e. the results were worse and more time consum- 
ing than the ones obtained using the standard n.e. (i.e. 
without whitening the signal). Experimental results are 
given in the following sections. For these reasons whiten- 
ing is not a suitable technique to improve our n.e. . 

5. The neural network based estimator system 

The process for periodicity analysis can be divided in the 
following steps: 

- Preprocessing 

We first interpolate the data with a simple linear fitting 
and then calculate and subtract the average pattern to 
obtain zero mean process (Karhunen & Joutsensalo 1994; 
Karhunen & Joutsensalo 1995). 

- Neural computing 

The fundamental learning parameters are: 

1) the initial weight matrix; 

2) the number of neurons, that is the number of principal 
eigenvectors that we need, equal to twice the number of 
signal periodicities (for real signals); 

3) e, i.e. the threshold parameter for convergence ; 

4) a, the nonlinear learning function parameter; 

5) /i, that is the learning rate. 

We initialise the weight matrix W assigning the classi- 
cal small random values. Otherwise we can use the first 
patterns of the signal as the columns of the matrix: exper- 
imental results show that the latter technique speeds up 
the convergence of our neural estimator (n.e.). However, 
it cannot be used with anomalously shaped signals, such 
as stratigraphic geological signals. 

Experimental results show that a can be fixed to : 
1., 5., 10., 20., even if for symmetric networks a smaller 
value of a is preferable for convergence reasons. Moreover, 
the learning rate jj, can be decreased during the learning 
phase, but we fixed it between 0.05 and 0.0001 in our 
experiments. 

We use a simple criterion to decide if the neural net- 
work has reached the convergence: we calculate the dis- 
tance between the weight matrix at step fc -I- 1, W^+i, 
and the matrix at the previous step W^ , and if this dis- 
tance is less than a fixed error threshold (e) we stop the 
learning process. 

We finally have the following general algorithm in 
which STEP 4 is one of the neural learning algorithms 
seen above in section ||: 

STEP 1 Initialise the weight vectors Wo(i) Vi = 1, . . . ,p 
with small random values, or with orthonormalised sig- 
nal patterns. Initialise the learning threshold e, the 
learning rate ^. Reset pattern counter fc = 0. 

STEP 2 Input the k-th pattern x^ = [x{k), ...,x{k + 
N + 1)] where N is the number of input components. 

STEP 3 Calculate the output for each neuron y{j) = 
w^(j)xi Vi = l,...,p. 
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STEP 4 Modify the weights Wfe+i(z 

fJ'k9{yk{i))ek{i) Vi = l,...,p. 
STEP 5 Calculate 



testnorma = 



p N 



(Wfe+l(ij)- Wfc(ij))2. 



(10) 



STEP 6 Convergence test: if {testnorma < e) then goto 

STEP 8. 
STEP 7 k = k+l. Goto STEP 2. 
STEP 8 End. 

- Frequency estimator 

We exploit the frequency estimator Multiple Signal Clas- 
sificator (MUSIC). It takes as input the weight matrix 
columns after the learning. The estimated signal frequen- 
cies are obtained as the peak locations of the following 
function (Kay 1988; Marple 1987): 



Pmusic = log(: 



1 



l-E-:i|efw(z)| 



(11) 



where w{i) is the i—th weight vector after learn- 



ing, and e^ is the pure sinusoidal vector : e 

[1, 



J2^/ 



^j27r/(L-l)iif 



When / is the frequency of the z— th sinusoidal com- 
ponent, / = /i, we have e = e^ and Pmusic ^ oo. In 
practice we have a peak near and in correspondence of the 
component frequency. Estimates are related to the highest 
peaks. 

6. Music and the Cramer-Rao Lower Bound 

In this section we show the relation between the Music 
estimator and the Cramer-Rao bound following the nota- 
tion and the conditions proposed by Stoica and Nehorai 
in their paper (Stoica and Nehorai 1990). 

6.1. The model 

The problem under consideration is to determine the pa- 
rameters of the following model: 



y(t) = A(0)x(i) + e{t) 



(12) 



where {y(t)} G C™^^ are the vectors of the observed data, 
{x(t)} G (7"xi are the unknown vectors and e(t) G C""^^ 
is the added noise; the matrix A{6) G C""^" and the vector 
9 are given by 



.4(0) = [a(c^i)...a(c^„)] 



kJi...w„ 



(13) 



Now, we reformulate MUSIC to follow the above nota- 
tion. The MUSIC estimate is given by the position of the 
n smallest values of the following function: 



/ (lu) = a* (lu) GG*a (cj) = a* (w) 



I-SS* 



a(c.) (14) 



From equation(14) we can define the estimation error 
of a given parameter. {cDj — uji} has (for big N) an as- 
intotic gaussian distribution, with mean and with the 
following covariance matrix: 

Cmu = ^{Ho ly' Re \h o {A*UAf} {H o ly' (15) 

where Re (x) is the real part of x, where 



H = D*GG*D = D* 



I^A{A*Ay^A' 



D 



and where U is implicitly defined by: 
A*UA = p-i + o-p-i {A*Ay^ P-^ 



(16) 



(17) 



where P is the covariance matrix of x (t) . The elements 
of the diagonal of the matrix Cmu s^re the variances of 
the estimation error. On the other hand, the Cramer-Rao 
lower bound of the covariance matrix of every estimator 
of 9, for large N, is given by: 



CcR^^{Re[HoP^]y 



(18) 



Therefore the statistical efficiency can be defined with the 
condition that P is diagonal as: 



[CMu]ii ^ [Ccflli 



(19) 



where the equality is reached when m increases if and only 
if 



a* (lu) a(w) 



CXD 



as m 



oo. 



For P non-diagonal, [CMu]ii > [C^cr]. 
model used in the spectral analysis 



(20) 
To adapt the 



y{k) = J2A, 



juiik 



-e(fc) 



fc = l,2, 



.M 



(21) 



where M is the total number of samples, to equation(p^ 
we make the following transformations, after fixing an in- 
teger m > p: 



y{t) = [yt 

a(w) 
x(i) 



••■ yt+m-l\ 



[Aie 



jiuit 



A„e^""*] 



i = l, 



,M 



(22) 

^ + 1 



where a (lu) varies with the applications. Our aim is to In this way our model satisfies the conditions of (Stoica 

estimate the unknown parameters of 9. The dimension n and Nehorai 1990). Moreover, equations (|2^) depend on 

of x(t) is supposed to be known a priori and the estimate the choice of m which infiuences the minimum error vari- 

of the parameters of x(i) is easy once 9 is known. ance. 
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Fig. 1. CRB and standard deviation of n.e. estimates; ab- 
scissa is the distance between the frequencies 0^2 and wi . 
CRB (down); standard deviation of n.e. (up) with to = 5, 
a = 0.5 and M = 50. 




Fig. 2. CRB and standard deviation of n.e. estimates; ab- 
scissa is the distance between the frequencies l^i and lo\. 
CRB (down); standard deviation of n.e. (up) with m = 5, 
a = 0.001 and M = 50. 



6.2. Comparison between PCA-MUSIC and the 
Cramer-Rao lower bound 

In this subsection we compare the n.e. method with the 
Cramer-Rao lower bound, by varying the frequencies dis- 
tance, the parameters M and m and the noise variance. 

From the experiments it derives that, fixed M and m, 
by varying the noise (white Gaussian) variance, the n.e. 
estimate is more accurate for small values of the noise 
variance as shown in figures 1-3. For Auj small, the noise 
variance is far from the bound. By increasing to, the es- 
timate improves, but there is a sensitivity to the noise 
(figures 4-6). By varying M, there is a sensitivity of the 
estimator to the number of points and to m (figures 7-8). 
In fact, if we have a quite large number of points we reach 
the bound as illustrated in figures 9-10. 

Therefore, the n.e. estimate depends on both the in- 
crease of m and the number of points in the input se- 
quence. Increasing the number of points, we improve the 
estimate and the error approximates the Cramer-Rao 
bound. On the other hand, for noise variances very small, 
the estimate reaches a very good performance. Finally, we 
see that in all the experiments shown in the figures we 
reach the bound with a good approximation, and we can 
conclude that the n.e. method is statistically efhcient. 

7. Experimental results 

7.1. Introduction 

In this section we show the performance of the neural 
based estimator system using artificial and real data. 
The artificial data are generated following the literature 
(Kay 1988; Marple 1987) and are noisy sinusoidal mix- 



0.35 




Fig. 3. CRB and standard deviation of n.e. estimates; ab- 
scissa is the distance between the frequencies uj2 and toi. 
CRB (down); standard deviation of n.e. (up) with m ~ 5, 
a = 0.0001 and M = 50. 



tures. These are used to select the neural models for the 
next phases and to compare the n.e. with P's, by using 
Montecarlo methods to generate samples. Real data, in- 
stead, come from astrophysics: in fact, real signals are light 
curves of Cepheids and a light curve in the Johnson's sys- 
tem. 



In the sections |7.3| and [7. 4 we use an extension of 
Music to directly include unevenly sampled data without 
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Fig. 4. CRB and standard deviation of n.c. estimates; ab- 
scissa is the distance between the frequencies W2 and uji . 
CRB (down); standard deviation of n.c. (up) with m = 20, 
a = 0.5 and M = 50. 




Fig. 6. CRB and standard deviation of n.c. estimates; ab- 
scissa is the distance between the frequencies lu2 and uji. 
CRB (down); standard deviation of n.e. (up) with m = 20, 
a = 0.0001 and M = 50. 




Fig. 5. CRB and standard deviation of n.e. estimates; ab- 
scissa is the distance between the frequencies W2 and uji . 
CRB (down); standard deviation of n.e. (up) with m = 20, 
a ^ 0.01 and M = 50. 

using the interpolation step of the previous algorithm in 
the following way: 



P' 

^MUSIC 



1 



M-E^=l|efw(^)| = 



(23) 



where p is the frequency number, w(i) is the i— th 
weight vector of the PCA neural network after the 
learning, and e? is the sinusoidal vector: e? = 



[l,ef"^*«,...,ef"^*<^-^>]^ where {io, ti, •.., t(L-i)} are 
the first L components of the temporal coordinates of the 
uneven signal. 




0.02 0.04 0.06 



0.1 0.12 0.14 0.16 0.1B 0,2 
(W2-Wl) 



Fig. 7. CRB and standard deviation of n.e. estimates; ab- 
scissa is the distance between the frequencies uj2 and uji. 
CRB (down); standard deviation of n.e. (up) with m — 5, 
a = 0.01 and M = 20. 



Furthermore, to optimise the performance of the PCA 
neural networks, we stop the learning process when 
Si'=i l®?'^(OP > ^ '^/) so avoiding overfitting prob- 
lems. 



7.2. Models selection 

In this section we use synthetic data to select the neural 
networks used in the next experiments. In this case, the 
uneven sampling is obtained by randomly deleting a fixed 
number of points from the synthetic sinusoid-mixtures: 
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Fig. 8. CRB and standard deviation of n.e. estimates; ab- 
scissa is the distance between the frequencies 0^2 and wi . 
CRB (down); standard deviation of n.e. (up) with to = 5, 
a = 0.01 and M = 20. 
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Fig. 9. CRB and standard deviation of n.e. estimates; ab- 
scissa is the distance between the frequencies L02 and wi . 
CRB (down); standard deviation of n.e. (up) with m — 20, 
a = 0.01 and M = 100. 



this is a widely used technique in the speciahsed literature 
(Home & Baliunas 1986). 

The experiments are organised in this way. First of all, 
we use synthetic unevenly sampled signals to compare the 
different neural algorithms in the neural estimator (n.e.) 
with the Scargle's P. 

For this type of experiments, we realise a statistical 
test using five synthetic signals. Each one is composed by 
the sum of five sinusoids of randomly chosen frequencies in 
[0, 0.5] and randomly chosen phases in [0, 27r] (Kay 1988; 
Karhunen & Joutsensalo 1994; Marple 1987), added to 




Fig. 10. CRB and standard deviation of n.e. estimates; 
abscissa is the distance between the frequencies 102 and 

CRB (down); standard deviation of n.e. (up) with m ~ 50, 
a = 0.001 and M = 100. 



white random noise of fixed variance. We take 200 samples 
of each signal and randomly discard 50% of them (100 
points), getting an uneven sampling (Home & Baliunas 
1986). In this way we have several degree of randomness: 
frequencies, phases, noise, deleted points. 

After this, we interpolate the signal and evaluate the P 
and the n.e. system with the following neural algorithms: 
robust algorithm in equation (0) in the hierarchical and 
symmetric case; nonlinear algorithm in equation(ra) in the 
hierarchical and symmetric case. Each of these is used 
with two nonlinear learning functions: 51 (i) = tanh(at) 
and gi(t) — sgn(t)\o%[\ -I- a|i|). Therefore we have eight 
different neural algorithms to evaluate. 

We chose these algorithms after we made several ex- 
periments involving all the neural algorithms presented in 
section 0, with several learning functions, and we verified 
that the behaviour of the algorithms and learning func- 
tions cited above was the same or better than the others. 
So we restricted the range of algorithms to better show 
the most relevant features of the test. 

We evaluated the average differences between target 
and estimated frequencies. This was repeated for the five 
signals and then for each algorithm we made the average 
evaluation of the single results over the five signals. The 
less this averages were, the greatest the accuracy was. 

We also calculated the average of the number of epochs 
and CPU time for convergence. We compare this with the 
behaviour of P. 

Common signals parameters are: number of frequencies 
— 5, variance noise — 0.5, number of sampled points — 
200, number of deleted points = 100. 
Signal 1: frequencies=0.03, 0.19, 0.25, 0.33, 0.46 1/s 
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Fig. 11. Synthetic Signal. 



Fig. 12. P estimate. 



Signal 2: frequencies=0.02, 0.11, 0.20, 0.33, 0.41 1/s 
Signal 3: frequencies=0.34, 0.29, 0.48, 0.42, 0.04 1/s 
Signal 4: frequencics=0.32, 0.20, 0.45, 0.38, 0.13 1/s 
Signal 5: frequencies=0.02, 0.37, 0.16, 0.49, 0.31 1/s 
Neural parameters: a = 10.0; [i = 0.0001; e = 0.001; num- 
ber of points in each pattern iV = 110 (these are used 
for almost all the neural algorithms; however, for a few of 
them a little variation of some parameters is required to 
achieve convergence). 

Scargle parameters: Tapering = 30%, po = 0.01. 
Results are shown in Table 111: 

We have to spend few words about the differences of 
behaviour among the neural algorithms elicited by the ex- 
periments. Nonlinear algorithms are more complex than 
robust ones; they are relatively slower in converging, with 
higher probability to be caught in local minima, so their 
estimates results are sometimes not reliable. So we restrict 
our choice to robust models. Moreover, symmetric mod- 
els require more effort in finding the right parameters to 
achieve convergence than the hierarchical ones. The per- 
formance, however, are comparable. 

From Table |l| we can see that the best neural algo- 
rithm for our aim is the n.5 in Table y (equation(y) in the 
symmetric case with learning function g\(t) — tanh(at)). 

However, this algorithm requires much more efforts in 
finding the right parameters for the convergence than the 
algorithm n.2 from the same table (equation(0) in the hier- 
archical case with learning function .92 (i) = sgn{t) log(l + 
a|i|)), which has performance comparable with it. 

For this reason, in the following experiments when we 
present the neural algorithm, it is algorithm n.2. 

We show, as an example, in figures 11-13 the estimate 
result of the n.e. algorithm and P on signal n.l. 
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Fig. 13. n.e. estimate. 



We now present the result for whitening pre-processing 
on one synthetic signal (figures 14-16). We compare this 
technique with the standard n.e. . 
Signal frequencies=0.1, 0.15, 0.2, 0.25, 0.3 1/s 
Neural network estimates with whitening: 
0.1,0.15,0.2,0.25,0.3 1/s 

Neural network estimates without whitening: 
0.1,0.15,0.2,0.25,0.3 1/s 

From this and other experiments we saw that when we 
used the whitening in our n.e. the results were worse and 
more time consuming than the ones obtained using the 
standard n.e. (i.e. without whitening the signal). For these 
reasons whitening is not a suitable technique to improve 
our n.e. . 
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Table 1. Performance evaluation of n.e. algorithms and P on synthetic signals. 
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Fig. 14. Synthetic Signal. 
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Fig. 15. n.e. estimate without whitening. 



7. 3. Comparison of the n. e. with the Lomb 's 
Periodogram 

Here we present a set of synthetic signals generated by ran- 
dom varying the noise variance, the phase and the deleted 
points with Montecarlo methods. The signal is a sinusoid 
(0.5cos(27r0.1i + 0) + R{t)) with frequency 0.1 Hz, R{t) 
the Gaussian random noise with mean composed by 100 
points, with a random phase. We follow Home & Baliunas 
(Home & Baliunas 1986) for the choice of the signals. 

We generated two different series of samples depend- 
ing on the number of deleted points: the first one with 
50 deleted points, the second one with 80 deleted points. 
We made 100 experiments for each variance value. The 
results are shown in Table || and Table y, and compared 



with the Lomb's P because it works better than the Scar- 
gle's P with unevenly spaced data, introducing confidence 
intervals which are useful to identify the accepted peaks. 
The results show that both the techniques obtain a 
comparable performance. 



7.4- Real data 

The first real signal is related to the Cepheid SU Cygni 
(Fernie 1979). The sequence was obtained with the pho- 
tometric tecnique UBVRI and the sampling made from 
June to December 1977. The light curve is composed by 
21 samples in the V band, and a period of S.S**, as shown 
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Table 2. Synthetic signal with 50 deleted points, frequency interval [^, ^^^j , MSE = Mean Square Error, T = total 
period {Xmax — ^min) and No = total number of points. 
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Table 3. Synthetic signal with 80 deleted points, frequency interval [^, ^^r^] , MSE = Mean Square Error, T = total 



period {X„ 



Xmin) a-iid No = total number of points. 
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Fig. 16. n.e. estimate with withening. 



Fig. 17. Light curve of SU Cygni. 



in figure 17. In this case, the parameters of the n.e. are: 
N ^10,p^2,a = 20, fi = 0.001. The estimate frequency 
interval is [0(1/71?), 0. 5(1/ JI?)]. The estimated frequency 
is 0.26 (1/JD) in agreement with the Lomb's P, but with- 
out showing any spurious peak (see figures 18 and 19). 

The second real signal is related to the Cepheid U Aql 
(Moffet and Barnes 1980). The sequence was obtained 
with the photometric tecnique BVRI and the sampling 
made from April 1977 to December 1979. The light curve 
is composed by 39 samples in the V band, and a period of 
7.01'', as shown in figure 20. In this case, the parameters 
of the n.e. are: TV = 20, p == 2, a == 5, ^ = 0.001. The 
estimate frequency interval is [0(1/71?), 0. 5(1/ JI?)]. The 
estimated frequency is 0.1425 (1/JD) in agreement with 



the Lomb's P, but without showing any spurious peak (see 
figures 21 and 22). 

The third real signal is related to the Cepheid X Cygni 
(Moffet and Barnes 1980). The sequence was obtained 
with the photometric technique BVRI and the sampling 
made from April 1977 to December 1979. The light curve 
is composed by 120 samples in the V band, and a period 
of 16.38'*, as shown in figure 23. In this case, the parame- 
ters of the n.e. are: iV = 70, p = 2, a = 5, ^ = 0.001. The 
estimate frequency interval is [0(1/ JD), 0.5(1/ JD)]. The 
estimated frequency is 0.061 (1/JD) in agreement with the 
Lomb's P, but without showing any spurious peak (see fig- 
ures 24 and 25). 

The fourth real signal is related to the Cepheid T Mon 
(Moffet and Barnes 1980). The sequence was obtained 
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Fig. 18. n.e. estimate of SU Cygni. 



Fig. 20. Light curve of U Aql. 
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Fig. 19. Lomb's P estimate of SU Cygni. 
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Fig. 21. n.e. estimate of U Aql. 



with the photometric technique BVRI and the samphng 
made from April 1977 to December 1979. The light curve 
is composed by 24 samples in the V band, and a period of 
27.02"^, as shown in figure 26. In this case, the parameters 
of the n.e. are: N = 10, p ^ 2, a ^ 5, ^ = 0.001. The 
estimate frequency interval is [0(1/ JD), 0. 5(1/ JI?)]. The 
estimated frequency is 0.037 (1/JD) (see figure 28). 

The Lomb's P does not work in this case because there 
many peaks, and at least two greater than the threshold 
of the most accurate confidence interval (see figure 27). 

The fifth real signal we used for the test phase is a light 
curve in the Johnson's system (Binnendijk 1960) for the 
eclipsing binary U Peg (see figure 29 and 30). This system 
was observed photoelectrically in the effective wavelengths 
5300 A and 4420 A with the 28-inch reflecting telescope 
of the Flower and Cook Observatory during October and 
November, 1958. 



We made several experiments with the n.e., and we 
elicited a dependence of the frequency estimate on the 
variation of the number of elements for input pattern. 
The optimal experimental parameters for the n.e. are: 
N = 300, a — 5; iJ, = 0.001. The period found by the n.e. is 
expressed in JD and is not in agreement with results cited 
in literature (Binnendijk 1960), (Rigterink 1972), (Zhai et 
al. 1984), (Lu 1985) and (Zhai et al. 1988). The fundamen- 
tal frequency is 5.4 1/JD (see figure 31) instead of 2.7 
1/JD. We obtain a frequency double of the observed one. 
Lomb's P has some high peaks as in the previous exper- 
iments and the estimated frequency is always the double 
of the observed one (see figure 32). 
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Fig. 22. Lomb's P estimate of U Aql. 
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Fig. 24. n.e. estimate of X Cygni. 
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Fig. 23. Light curve of X Cygni. 
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Fig. 25. Lomb's P estimate of X Cygni. 
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8. Conclusions 

We have realised and experimented a new method for 
spectral analysis for unevenly sampled signals based on 
three phases: preprocessing, extraction of principal eigen- 
vectors and estimate of signal frequencies. This is done, 
respectively, by input normalization, nonlinear PCA neu- 
ral networks, and the Multiple Signal Classificator algo- 
rithm. First of all, we have shown that neural networks 
are a valid tool for spectral analysis. 

However, above all, what is really important is that 
neural networks, as realised in our neural estimator sys- 
tem, represent a new tool to face and solve a problem 
tied with data acquisition in many scientific fields: the 
unevenly sampling scheme. 

Experimental results have shown the validity of our 
method as an alternative to Periodogram, and in general 



to classical spectral analysis, mainly in presence of few 
input data, few a priori information and high error prob- 
ability. Moreover, for unevenly sampled data, our system 
offers great advantages with respect to P. First of all, it 
allows us to use a simple and direct way to solve the prob- 
lem as shown in all the experiments with synthetic and 
Cepheid's real signals. Secondly, it is insensitive to the fre- 
quency interval: for example, if we expand our interval in 
the SU Cygni light curve, while our system finds the cor- 
rect frequency, the Lomb's P finds many spurious frequen- 
cies, some of them greater than the confidence threshold, 
as shown in figures 33 and 34. 

Furthermore, when we have a multifrequency signal, 
we can use our system also if we do not know the frequency 
number. In fact, we can detect one frequency at each time 
and continue the processing after the cancellation of the 
detected periodicity by IIR filtering. 



14 



R. Tagliaferri et al.: Spectral Analysis of Stellar Light Curves by Means of Neural Networks 




100 200 300 400 500 600 700 
time (JD) 



0.6 I — - — 

ill 

0-5 I li 
I I 

0.4 I I 

* I* 

I * 
d * * 

> 0.2 1 ii 

* I* 

« i* 
0.1 -* 



-0.1 L 



i« 



I i 



i * 
t t 



10 15 20 25 30 35 40 

time(JD) 



Fig. 26. Light curve of T Mon. 



Fig. 28. n.c. estimate of T Mon. 
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Fig. 27. Lomb's P estimate of T Mon. 



Fig. 29. Light curve of U Peg. 
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A point worth of noting is the failure to find the right 
frequency in the case of eclipsing binary for both our 
method and Lomb's P. Taking account of the morphol- 
ogy of eclipsing light curve with two minima, this fact can 
not be of concern because in practical cases the important 
thing is to have a first guess of the orbital frequency. Fur- 
ther refinement will be possible through a wise planning of 
observations. In any case we have under study this point 
to try to find a method to solve the problem. 
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Fig. 30. Light curve of U Peg (first window). 
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Fig. 34. Lomb's P of SU Cygni with enlarged window. 
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